ABSTRACT: Early Warning Systems (EWSs) aggregate multiple sources of data to provide timely 
information to stakeholders about students in need of academic support. There is an increasing 
need to incorporate relevant data about student behaviours into the algorithms underlying EWSs 
to improve predictors of student success or failure. Many EWSs currently incorporate counts of 
course resource use, although these measures provide no information about which resources 
students are using. We use seven years of data from seven core STEM courses at a large 
university to investigate the associations between student use of categorized course resources 
(e.g., lecture or exam preparation resources) and their final course grade. Using logistic 
regression, we find that students who use exam preparation resources to a greater degree than 
their peers are more likely to receive a final grade of B or higher. In contrast, students who use 
more lecture-related resources than their peers are less likely to receive a final grade of B or 
higher. We discuss the implications of our results for developers deciding how to incorporate 
categories of course resource usage data into EWSs, for academic advisors using this information 
with students, and for instructors deciding which resources to include on their LMS site. 

Keywords: Early warning systems, academic advisors, learning management systems, course 
resources, student grades 

1 INTRODUCTION 

With the growing interest in learning analytics (LA), colleges and universities are expanding their use and 
development of systems that aggregate multiple sources of student data to produce insights into 
student behaviour related to academic success (Campbell, DeBlois, & Oblinger, 2007; Siemens & Long, 
2011; Johnson, Becker, Estrada, & Freeman, 2014). Identifying students "at risk" for failing courses 
and/or dropping out has long been an area of interest predating the term learning analytics (see 
Braxton, 2000). However, research in higher education is moving away from solely identifying the lowest 
performing students to instead using "big data" to describe and predict performance and learning 
outcomes for all students (Siemens & Long, 2011). In particular, researchers and practitioners are asking, 
"What data is actionable for all students?" Once identified, "What actions are facilitated by the 
presentation of the data?" and "How can the data be represented in a manner that can reasonably be 
interpreted and acted upon to improve teaching and learning outcomes?" 

One class of systems to emerge from LA work is Early Warning Systems (EWSs), also called 
Early Alert Systems (EAS). Researchers and commercial vendors are designing EWSs to provide 
information to students, instructors, advisors, and/or other intermediaries for the purposes of quickly 
and easily identifying students in need of academic support (Beck & Davidson, 2001; Macfadyen & 
Dawson, 2010). As EWSs become an integrated tool within educational technologies, researchers and 
developers must begin to evaluate carefully the components underlying the algorithms driving these 
systems (Ferguson, 2012). 

Both researchers and commercial vendors of web-based technologies, particularly learning management 
systems (LMSs), have become interested in creating new tools or system add-ons that 
collect student activity data for assessing student risk factors and providing actionable 
information. Given the consistency in the types of data generated by LMSs, similar kinds of data 
elements comprise the algorithms and categorization schemes behind EWSs (Sharkey & Ansari, 2014). 
Most systems rely primarily on student grade information and login frequency; developers typically 
display these data in relation to relative class performance. However, it is now possible to 
investigate which additional data points can further explain the variation in 
student course outcomes and thereby recognize both successful and unsuccessful individual-specific 
behaviours to provide more personalized feedback when designing interventions. 

In this paper we use seven years of data from seven core science, technology, engineering, and 
mathematics (STEM) courses to investigate how additional data generated by students' course LMS 
resource use can be incorporated into an EWS to refine its explanatory and classification power. We 
focus on these STEM courses because they represent the main "gateway" courses in these disciplines 
where undergraduate students experience their first indication of whether or not they are likely to 
succeed in majoring in STEM coursework. Specifically, we investigated student use of the various types 
of course resources that instructors typically make available in their course website. 

To conduct our study, we utilized an existing EWS called Student Explorer (Krumm, Waddington, Teasley, & Lonn, 2014), 
which draws student activity data from the university's LMS. The 
current form of Student Explorer provides academic advisors with real-time data about student grades 
and LMS login frequency. However, advisors may be able to target their interventions more effectively if 
the EWS included data about whether or not students are accessing and using important course 
resources, such as practice exams or lecture notes. Moreover, the results of this study can provide 
important information to instructors about the course resources they include in an LMS, as well as 
inform EWS developers about the information they should include in data analyses and display via 
dashboards. 

Two primary research questions guided our work: 

(RQ1) "What is the association between students' use of four types of course LMS resources and the 
likelihood that a student receives a final course grade of A or B in a core STEM course versus a 
C?" 

(RQ2) "Are there similarities or differences in the associations between students' course resource use 
based on type and final course grades across multiple courses?" 

Understanding the association between student use of course resource types and student grades will 
shed light on whether or not particular course resource data can be important indicators of course 
performance and how these data could be incorporated into an EWS such as Student Explorer. We 
outline the current landscape of the development of EWSs within the LA field before describing Student 
Explorer. Then, we detail the course resource LMS data, methods, analyses, and results of this study. We 
conclude the paper by discussing the implications of the results in the context of general EWS 
development and give consideration to the next steps for incorporating student resource use into 
Student Explorer. 

2 CONCEPTUAL FRAMEWORK 

2.1 Early Warning Systems Research 

Early academic analytics initiatives in higher education aimed to predict which students were at risk of 
academic difficulty (Campbell & Oblinger, 2007). Recent research in this area has differentiated 
academic analytics, employing data to support operational and financial decision making, from learning 
analytics, using data to understand and optimize student learning and the environments in which it 
occurs (SoLAR, n.d.; van Barneveld, Arnold, & Campbell, 2012). Optimizing the learning environment in 
higher education, particularly across students' concurrent course loads, includes presenting patterns 
and indicators of student behaviour to intermediaries (e.g., academic advisors and coaches) who can act 
upon such information (Duval, 2011; May, George, & Prevot, 2011). Our prior work has focused on 
leveraging a learning-analytics-powered early warning system, Student Explorer, to help academic 
advisors quickly identify students in need of academic support and allow these professionals to engage 
in sense-making activities that support subsequent actions (see Krumm et al., 2014; Lonn, Aguilar, & 
Teasley, 2015). 

Early warning systems (EWSs) use historical and formative educational data to identify students who 
might be at risk of academic failure, often doing so in near real time. To be valuable for informing 
academic interventions, Dringus (2012) argues that student data must be "measurable, visible, and 
transparent" (p. 98). Building on earlier proofs-of-concept that use such student data (e.g., Macfadyen & 
Dawson, 2010; Morris, Finnegan, & Wu, 2005), Course Signals was one of the first EWSs broadly 
deployed to use students' formative course performance, online learning management system (LMS) 
activity, prior academic history, and demographics to indicate the likelihood of academic failure to 
instructors (Arnold, 2010; Arnold & Pistilli, 2012). Jayaprakash and colleagues created the Open 
Academic Analytics Initiative (OAAI), which sought to create an open predictive model for use in EWSs 
based on Campbell's (2007) original model for Signals (Jayaprakash, Moody, Lauria, Regan, & Baron, 
2014). Testing this model first at Marist College and subsequently at four small to mid-size institutions, 
the investigators found that the model was effective in large lecture-style courses with enrollments of 
100+ students, but the "value added" for an instructor was harder to discern in smaller class sizes. 
Additionally, Jayaprakash et al. (2014) uncovered a general trend in which some students improved after 
receiving one "treatment" (e.g., being contacted by the instructor based on the predictive model), while 
another group of students did not improve regardless of the number of "treatments" received. 
Finally, the investigators indicate that EWSs that utilize models based on blended courses do not 
translate well to fully online course contexts. 

This dichotomy in EWS effectiveness depends on whether recent investigations account for course 
modality. In online contexts, particularly where LMS activity is translated into "time on task" variable 
constructs, an EWS can effectively characterize a student's current learning performance (Hu, Lo, & Shih, 
2014). Further demonstrating this contextual difference, Agudo-Peregrina and colleagues (2014) found 
an association between LMS interactions and academic performance in online courses, but not in LMS- 
supported blended courses. The LMS interaction association for the online courses examined by these 
investigators could be classified by 1) agent (student-student, student-teacher, student-system, and 
student-content), 2) frequency of LMS use, and 3) mode of use (active vs. passive). While some argue 
that time-on-task is indicative of time spent on learning and is thus of critical concern to all LA initiatives 
(Kovanovic et al., 2015), particularly for EWS implementations, such estimations can artificially smooth 
over important differences in how LMSs are used in different contexts. For example, Beer, Jones, and 
Clark (2009) found a strong association between LMS activity and course grades across five years of LMS 
data for blended courses, but instructor use was a significant mitigating factor. When the instructor's 
use of the LMS was "super low," there was no discernible association between LMS usage and grades. 

Considering this variability, we first developed Student Explorer to use the data most common across all 
LMS course websites: grades and logins. Website hit consistency (how regularly a student visits a 
website between class meetings) is an important indicator that can signal more interest in a course 
and better time management skills (Baugher, Varanelli, & Weisbord, 2003). Student Explorer therefore 
uses the rank percentile of students' weekly LMS website views (all page views within a course website) 
to gauge login consistency against all other students in the course. In the study described in this paper, 
we extend this work by adding another common data element, use of file resources, but do so in an 
approach that takes advantage of the available metadata for each file. Other EWSs have used counts of 
file resource uploads and downloads in their algorithms and found that such activity can explain more 
variation of individual grades than website logins (e.g., Braender & Naples, 2013). Our belief is that a 
more detailed approach that leverages file metadata can extend the utility of EWSs beyond retention 
(Arnold & Pistilli, 2012) and enrollment (Harrison, Villano, Lynch, & Chen, 2015) outcomes to provide 
avenues to maximize learning for all students. Below, we give a detailed description of the development 
of Student Explorer before describing our motivation for conducting this study and expanding the 
capabilities of this EWS. 

2.2 Development of Student Explorer 

Student Explorer is an EWS that originally provided near real-time data from the LMS at a large research 
university to support the existing work of academic advisors in the STEM (Science, Technology, 
Engineering, and Mathematics) Academy (Krumm et al., 2014). The aim of the STEM Academy is to 
increase the academic success of historically underrepresented students in STEM fields through a 
holistic student development program. Researchers and the STEM Academy's academic advisors and 
leaders developed Student Explorer through a two-year collaborative effort using principles of design- 
based research (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003; Krumm et al., 2014). Student Explorer 
now also serves staff members in Summer Bridge, probation, and general engineering advising roles 
across campus, annually tracking over 4,500 undergraduate students (Lonn et al., 2015). 

Prior to Student Explorer, advisors relied upon students' self-reported grades during face-to-face 
meetings or instructor-provided midterm progress reports. The infrequency of these meetings and 
reports, combined with the reliance on self-reported grades, did not allow advisors to intervene in as 
timely or targeted a manner as hoped. Therefore, mentors used Student Explorer to more readily 
identify and engage students in need of academic support in discussions about their ongoing 
performance (Krumm et al., 2014). 

Student Explorer aggregates course grade and LMS site page views for each student for all of their 
courses. Academic advisors view the aggregated grade and page view data through a variety of 
visualizations, including within-course comparisons of students' performance relative to their peers over 
the term. Advisors reported that the most useful feature of Student Explorer is a three-level classification 
scheme that combines academic performance and page view data to highlight which students are doing 
well, are having difficulty, or are in immediate need of academic support. The system also allows advisors to 
drill down into the students' grade data to view performance on individual graded elements, such as 
homework, quizzes, and exams. However, Student Explorer does not provide information about student 
use of specific course LMS resources or the potential influence of resource use on grades. See Krumm et 
al. (2014) for a detailed explanation of the design and development of Student Explorer. 

2.3 Motivation for Including Course Resource Use in Student Explorer 

The classification scheme and information about a student's developing grade shown by Student 
Explorer are valuable components for advisors to identify quickly which students are struggling. 
However, the usefulness of displaying LMS data in an EWS goes well beyond providing a single indicator 
of student performance. Incorporating additional performance-related LMS data into the EWS would 
allow advisors or other users to intervene in a more personalized manner. 

In their current form, many EWSs rely upon "prediction" models, which combine sources of information 
about student characteristics and activity-to-date into a measure of how a student is going to do. One of 
the benefits of Student Explorer is that the system provides information about how the student is currently 
doing across an array of courses. Part of our challenge, then, is to incorporate various types of 
information related to student course performance across a diverse set of courses by using a 
straightforward data mining and modelling approach. Further, the information needs to be interpretable 
by the intended users. Few EWSs incorporate the course resources that students are "hitting" (viewing, 
downloading, saving, etc.), and those that do incorporate such information only account for overall file 
upload and/or download counts (e.g., "RioPACE," Ornelas, Ordonez & Huston, 2014). Therefore, we 
investigated the association between categories of course resources and student grades in a single 
course to first determine whether these data matter. Later, we consider how developers might 
incorporate and advisors might interpret resource usage data so that it is scalable across courses. 

When considering the need to modify the current version of Student Explorer by including information 
about student use of course resources, we focus on three primary audiences: academic advisors, course 
instructors, and designers of EWSs. For advisors, having information on resource use helps to provide 
more concrete information about what students are or are not doing as part of their work for a course. 
This provides an additional layer of information in the connection between a student's performance and 
their actual habits/activities. In addition, by understanding more about the resource-use trends 
displayed by top students in previous iterations of a given course, advisors can have a better benchmark 
when conversing with current students. That said, we are not studying the possible inclusion of course 
resource information to distinguish between students performing well; rather, we want to add this 
component so that students on the margin of receiving a desired versus an undesired final grade can 
receive more targeted support from advisors to improve their habits and academic performance. 

For instructors, investigating some of the patterns of resource use and their correlation with student 
grades yields information about the purpose and value of various resources. Earlier research on LMS use 
has shown that the most common instructor behaviour is to provide an increasing number of resources 
on their course site, often overwhelming students (Lonn, Teasley, & Krumm, 2011). Providing instructors 
with information about actual resource use along with the association between student resource use 
and course performance may allow them to more carefully consider which resources to include (and 
how they should be integrated) in the LMS. This approach contrasts with an instructor simply including 
everything that might be useful, relevant, or supplementary to the course content and assessments. This 
information is also important for those developing LA-based systems, such as EWSs. Developers and 
researchers can rely upon data-informed decision-making when designing models or including 
information in a dashboard or other displays that place a premium on parsimony. 

3 DATA AND METHODS 

3.1 Data Description 

Academic advisors in the STEM Academy primarily focus on providing support for courses in the STEM 
fields, and first-year students receive the greatest degree of support. Therefore, we focus on seven core 
first- and second-year courses for students majoring in the STEM fields. These include two 
Engineering, two Physics, and three Chemistry courses, to which we 
have assigned course name and number pseudonyms. 1 Our analysis focuses on all students in these 
courses, not solely STEM Academy students, such that any observed association between resource use 
and grades would be reflective of the population of students in the course as opposed to a targeted 
subset that could skew the results. A brief description of each course is included in Appendix A. 

We use student final course grades as our outcome measure of student performance. One of the 
distinct goals of STEM Academy academic advisors is to help students obtain an overall grade-point 
average (GPA) of 3.0 or higher. The student's GPA is an average of course performance across all 
courses completed, weighted by the credit hours earned for each course (a typical course at the 
university is three credit hours). The letter grade (e.g., A, A-, B+, B, etc.) a student receives at the end of 
a course is converted to a numerical value (e.g., A=4.00, A-=3.67, B+=3.33, B=3.00, etc.) to calculate the 
grade-point average. Therefore, an overall GPA of 3.0 corresponds to an average of a "B" final grade 
across courses. 
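
As a worked illustration of the calculation described above, the following minimal sketch computes a credit-weighted GPA. Only the four grade-point values quoted in the text are taken from the paper; the function name `weighted_gpa` and the example transcript are hypothetical and serve only to make the arithmetic concrete.

```python
# Grade points quoted in the text. Other letter grades are omitted on purpose
# because their numerical values are not listed in this section.
GRADE_POINTS = {"A": 4.00, "A-": 3.67, "B+": 3.33, "B": 3.00}


def weighted_gpa(transcript):
    """Return the credit-weighted GPA for (letter_grade, credit_hours) pairs."""
    total_credits = sum(credits for _, credits in transcript)
    if total_credits == 0:
        raise ValueError("transcript has no credit hours")
    quality_points = sum(GRADE_POINTS[grade] * credits for grade, credits in transcript)
    return quality_points / total_credits


# Hypothetical transcript: three 3-credit courses and one 4-credit course.
example = [("A", 3), ("B", 3), ("B+", 3), ("B", 4)]
print(round(weighted_gpa(example), 2))
```

Because the average is weighted by credit hours, the 4-credit "B" in this hypothetical transcript pulls the result toward 3.0 more strongly than any single 3-credit course does.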

We created two distinct outcome measures using final course grades for use in separate analyses. The 
first outcome is a dichotomous variable to indicate whether a student's final course grade was either an 
A or B versus a C. Advisors consider any A or B final course grades as "desirable" outcomes while C 
grades are "undesirable" outcomes, given the overall GPA goal of 3.0 for STEM Academy students. We 
condensed grades of A+, A, and A- into the "A" category and did the same for the "B" and "C" grade 
categories. 2 For the second outcome measure, we separate students earning A and B grades to 
distinguish differences in the associations between course resources and grades between all three 
groups of students. 
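
The two outcome measures can be sketched as simple recodings of the final letter grade. This is an illustrative sketch rather than the code used in the study; the function names are hypothetical, and any grade other than an A, B, or C variant is assumed to be excluded from analysis and is returned as `None`.

```python
def collapse_grade(letter):
    """Collapse plus/minus variants into their base letter, e.g. 'B-' -> 'B'."""
    return letter.strip().rstrip("+-").upper()


def binary_outcome(letter):
    """Return 1 for A or B (desirable), 0 for C (undesirable), None otherwise."""
    base = collapse_grade(letter)
    if base in ("A", "B"):
        return 1
    if base == "C":
        return 0
    return None  # assumed to be excluded from analysis


def three_level_outcome(letter):
    """Return 'A', 'B', or 'C' for the second outcome measure, None otherwise."""
    base = collapse_grade(letter)
    return base if base in ("A", "B", "C") else None


print(binary_outcome("B-"), three_level_outcome("A+"))
```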


1 We considered using data from one additional core chemistry and engineering course each. However, the grade distribution in 
these courses was such that less than 2% of students received a "C" or lower final grade. 

2 A final course grade of a B- (2.67) is lower than the overall GPA target of 3.0. However, in several of the introductory STEM 
courses at the university, the average student experiences what is known as a "grade penalty" due to the difficulty of the 
course, earning lower than they would in a typical course (see Huberth, Chen, Tritz, & McKay, 2015). Therefore, we group the B- 
students with the B+ and B students throughout our analyses and descriptions of results. 

We excluded all individuals receiving a D or F final grade in any course from our analyses for two 
reasons. First, the population of all students enrolled in STEM courses at the university (specific to this 
university context) is largely a high achieving group. There were only a small number of students 
receiving a D (poor performance, but still earning credit for the course) or F (failing the course) in each 
course (ranging from 0.1% to 6.2% of students in any one course, see Appendix Table C.1). The small 
proportions of poorly performing students make it difficult to disentangle any differential impacts of 
course resources on the final grades for these low achieving students versus all others. Second, we 
intend for the results of this study to refine the classification model already in place for the Student 
Explorer EWS. Added information about the use of course resources will help to classify and highlight to 
advisors those students who are specifically on the margin of receiving a "C" grade in the course (an 
undesirable grade) compared to a "B" grade (desirable). We could potentially observe positively biased 
estimates of the associations between course resource use and course grade if the D and F students 
"dragged down" the "C" students. For students at risk of receiving a D or F in the course, the EWS 
already alerts advisors about their poor performance by highlighting the student's academic 
performance on available exam, assignment, or other data before utilizing information about LMS 
resource use. 3 

We also excluded the few students who withdrew from the course at some point during the semester or 
received an incomplete final grade (0.9% to 4.2% of students across courses, see Appendix Table C.1). 
We removed these students from analysis for two reasons. First, these students do not receive a letter 
grade that necessarily reflects course performance (e.g., an incomplete final grade could later be 
changed to any possible letter grade, depending on performance). Second, because of receiving an 
incomplete grade in the course or withdrawing, many of these students will not have spent an 
equivalent amount of time in the course as their peers, leading to differences in exposure to the course 
resources. 

We classified course resources in the LMS by a structure that is adaptable across multiple semesters and 
courses. We initially created these categories based on data from the CHEM 101 course (see 
Waddington & Nam, 2014, where it is referred to as CHEM 100). The course structure and resources 
used in CHEM 101 remained relatively stable across course sections and semesters. We found the same 
consistency in the additional six courses we include in this study, as well as those courses having 
categories of course resources in common with CHEM 101. As a result, we were able to classify the LMS 
course resources into distinct, replicable groups and look at the impacts of their use on a student's 
grade across multiple semesters and courses instead of relying upon one semester (when available) or 
one course-worth of data. We are thus able to draw conclusions about categories of course resources 
over time and across courses. 


3 We display results from our multinomial logistic regression analyses that include the handful of students receiving a D or F 
final grade with C students in Appendix Table C.4. We find no discernible differences in our results from our preferred models 
where D and F students are excluded. 

We identified four broader categories of LMS course resources through our classification phase. These 
categories include course information resources (e.g., syllabus, announcements), lecture-related 
resources (e.g., notes, discussion), assignment-related resources (e.g., problem sets, experiments), and 
exam preparation resources (e.g., practice exams). We bundled individual resources into these broader 
categories by searching across all resources based on specific keywords. While there is some degree of 
variation in the materials provided for each course and each semester, each course uses some form of 
lecture or course information materials. 4 We included the category names and examples of LMS 
resources as well as further details of our categorization process in Appendix B. 
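
The keyword-based bundling described above might look something like the following sketch. The keyword lists here are hypothetical placeholders loosely based on the examples in the text (the actual keywords are documented in Appendix B), and the first-match rule with more specific categories checked first is an assumption, not a description of the study's procedure.

```python
# Hypothetical keyword lists. More specific categories are listed first so that,
# for example, a practice exam is not captured by a generic lecture keyword.
CATEGORY_KEYWORDS = {
    "exam_prep": ("practice exam", "exam review"),
    "assignment": ("problem set", "homework", "experiment"),
    "lecture": ("lecture", "notes", "discussion"),
    "course_info": ("syllabus", "announcement"),
}


def categorize_resource(title):
    """Return the first category whose keyword appears in the title, else None."""
    lowered = title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return None  # resource left uncategorized


for name in ("Lecture 5 Notes.pdf", "Practice Exam 1.pdf", "Syllabus Fall.pdf"):
    print(name, "->", categorize_resource(name))
```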

After categorizing the resources within each course into four groups, we next constructed the measures 
of each student's use of course resources. We calculated each student's percentile rank of course 
resource use within each resource category compared to their peers' use of the same category of 
resources in the same course section. The percentile rank measures range from 1 to 99. We chose to use 
within-course percentile rank for two reasons. First, these course resource use data are highly skewed, 
so a rank-based measure limits the influence of outliers who may access materials with far greater frequency than their peers (such as 
re-opening the same resource multiple times vs. saving and downloading). Second, using a relative 
measure allows us to combine data from multiple sections and semesters into one model and then 
make comparisons across semesters within a course and between courses. 
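
To make the percentile rank measure concrete, the sketch below ranks hypothetical hit counts for one resource category within a single course section. The exact formula is not specified in the text, so the mid-rank definition used here, along with the rounding and clamping to the 1 to 99 range, is an assumption; the student identifiers and counts are invented.

```python
def percentile_ranks(counts):
    """Map each student's hit count to a percentile rank within one section.

    Assumed formula (not specified in the paper): the mid-rank percentile
    100 * (below + 0.5 * equal) / n, rounded and clamped to the range 1-99.
    """
    n = len(counts)
    values = list(counts.values())
    ranks = {}
    for student, count in counts.items():
        below = sum(1 for v in values if v < count)
        equal = sum(1 for v in values if v == count)
        raw = 100.0 * (below + 0.5 * equal) / n
        ranks[student] = min(99, max(1, round(raw)))
    return ranks


# Hypothetical exam-preparation hit counts for one course section.
section_counts = {"s1": 0, "s2": 3, "s3": 3, "s4": 12, "s5": 250}
print(percentile_ranks(section_counts))
```

In this invented section, the student with 250 hits is ranked only 20 percentile points above the student with 12 hits, which illustrates how a rank-based measure dampens the effect of extreme counts.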

Our data contained records of students and course resources spanning three to twelve 16-week 
semesters (depending on course), from 2007 to 2014. In total, 26,843 students were enrolled across 
these seven courses over seven academic years; our analyses include the 26,784 students who have 
a full set of valid covariate measures, described in the next section. 

3.2 Estimation Strategy 

We used two versions of a logistic regression model to estimate the association between student course 
resource use and the likelihood that a student receives a certain final grade. Our first model (1) is a 
standard logistic regression model where we estimate the likelihood that a student receives an A or B 
final grade versus a C in a given course. We estimated separate models for each of the seven courses. 
Functionally, the logistic regression model takes the following form: 

$$\ln\left(\frac{\Pr[y_i = \text{A or B}]}{\Pr[y_i = \text{C}]}\right) = \alpha + \beta_1 \mathit{CInfo}_i + \beta_2 \mathit{Lect}_i + \beta_3 \mathit{Assign}_i + \beta_4 \mathit{ExamPrep}_i + \theta X_i + \sum_{s=1}^{N} \delta_s S_s \qquad (1)$$

In the above equation (1), the probability that a student $i$ receives an A or B ("desired outcome") in a 
given course versus a C ("undesired outcome") is a function of their within-course percentile rank in the 
use of four types of course resources (Course Information, Lecture-Related, Assignment-Related, and 
Exam Preparation). We adjust the estimates of the associations between resource use and final grade by 
controlling for a vector of covariates representing the student's demographic and academic background 
($X_i$) along with a fixed effect for the semester in which the student was enrolled in the course ($S_s$). 


4 CHEM 202 is a laboratory course that is required when a student enrolls in CHEM 201. The course does not have any exams. 
PHYS 201 does not have any assignments. 

The student demographic covariates include the student's sex. We use a dichotomous variable 
indicating the differences in final grade likelihood between females and males. Within the literature on 
performance in STEM courses (broadly and within the university), there are notable differences in the 
performances of men and women (Carrell, Page, & West, 2010; Kost, Pollock, & Finkelstein, 2009; 
Hazari, Tai, & Sadler, 2007). Because prior research has shown that international students are more 
engaged in educational activities than their American counterparts (Zhao, Kuh, & Carini, 2005), we also 
control for citizenship, including individuals who are U.S. citizens, non-citizen permanent residents, and 
non-U.S. residents. Indirectly, this measure also serves as a proxy for students for whom English is a 
second language. 

Regarding academic background, we control for a student's first-term math course placement. Prior to 
beginning their studies at the university, students take a math placement assessment that recommends 
the math course the student takes during their first term. Across the seven courses in our study, 92-99% 
of students took the placement assessment, and the first-term math course actually taken almost always matched the 
course placement suggested by the assessment. We created a dichotomous indicator of 
Calculus I or higher placement (the majority of students placed at this level), versus all lower courses 
(see Table 2 for details). We included a separate indicator for the 1-7% of students in a given course 
who did not take the placement test. We believe that the first-year math placement results are a better 
indicator of student math ability in the context of STEM course performance within this specific 
university than SAT or ACT math scores because they largely determine the first math course taken. We 
also avoid having to convert between SAT and ACT scores or having to adjust these scores as their scales 
changed over time. 

We also control for each student's semester GPA, which we have recalculated after removing the final 
grade earned in the STEM course that is the subject of analysis. This measure accounts for other 
contextual factors regarding a student's academic performance during the given semester in which they 
took the course. For example, a low semester GPA suggests that a student may be struggling 
academically independent from their performance in a given STEM course. 
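
The recalculated semester GPA can be illustrated with a short sketch that drops the focal STEM course before averaging. This is an assumed implementation, not the one used in the study: the function name, the course names other than CHEM 101, the grade-point values, and the choice to return `None` when no other courses remain are all hypothetical.

```python
def semester_gpa_excluding(courses, focal_course):
    """Credit-weighted semester GPA with the focal STEM course left out.

    `courses` maps course name -> (grade_points, credit_hours).
    Returns None when the focal course was the student's only course
    (an assumption; the paper does not say how this case is handled).
    """
    others = [(gp, cr) for name, (gp, cr) in courses.items() if name != focal_course]
    total_credits = sum(cr for _, cr in others)
    if total_credits == 0:
        return None
    return sum(gp * cr for gp, cr in others) / total_credits


# Hypothetical semester: the focal course is CHEM 101.
semester = {"CHEM 101": (2.0, 4), "ENGL 125": (4.0, 4), "MATH 115": (3.0, 4)}
print(semester_gpa_excluding(semester, "CHEM 101"))
```

Leaving out the focal course keeps the outcome (the grade in that course) from appearing on both sides of the model.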

We included semester fixed effects in our model to account directly for the variation in the distribution 
of student grades across N course semesters of available data for each course. Using pooled data across 
semesters yields a greater degree of consistency in estimating the association between course resources 
and student grades. In doing so, it is necessary to account for differences in average student 
performance between semesters through the semester fixed effects. The fixed-effect nature of the 
semester indicators also controls for unobserved factors between semesters that might influence the 
association between resource use and student grades. These unobserved factors include semester-to- 
semester differences in average student ability as well as any instructor-related differences such as the 
quality of teaching or encouraging the use of specific resources in the LMS. By including semester fixed 
effects, we estimate the within-semester association between a given student's course resource use and 
the likelihood of receiving a certain course grade after controlling for a host of covariates. 
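
Semester fixed effects in a logistic regression are commonly entered as a set of indicator variables with one semester omitted as the reference level. The sketch below shows that encoding only; it assumes this is how the fixed effects were specified, and the function name and the `sem_` column prefix are hypothetical.

```python
def semester_dummies(semesters):
    """One-hot encode semester labels, dropping the first sorted level as reference."""
    levels = sorted(set(semesters))
    kept = levels[1:]  # reference level omitted to avoid collinearity with the intercept
    return [{f"sem_{lvl}": int(s == lvl) for lvl in kept} for s in semesters]
```

Each remaining indicator then absorbs the average difference in outcomes between its semester and the reference semester.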

In model 1, $\beta_1$ through $\beta_4$ represent our estimates of interest and measure the association between 
each resource category and final course grade. We present our results in terms of odds-ratios, or a 
comparison of the likelihood of receiving an A or B versus a C in a given course. Thus, $\beta_1$ is the 
within-semester estimate of how much a one-percentile increase in the use of course information resources 
changes the likelihood that a student receives an A or B versus a C for a final course grade. The 
interpretations are the same for the other three resource types. While these estimates do not represent 
a causal link between resource use and final grades, we have reduced the bias in the estimates by 
adjusting for some potential confounding factors within the student background measures and semester 
fixed effects. Other unobserved factors may influence the associations between resource use and final 
grades. 

In addition to the logistic regression model, we also estimate a multinomial logistic regression model for 
each course. In the multinomial model, each final grade category (A, B, or C) is a separate outcome. We 
are able to compare differences in the association between course resources and the likelihood of 
receiving an A vs. B final grade with the likelihood of receiving a B vs. C final grade within the same 
model. The measures of resource use, covariates, and semester fixed-effects remain the same as in 
model 1. 

4 RESULTS 

4.1 Descriptive Results 

We first describe the distribution of students and grades in courses over all semesters in Table 1. The 
amount of available data is highly variable by course. For example, we have three semesters of data on 
CHEM 202, with only one course section per semester. At the opposite end of the spectrum, we have 
twelve semesters' worth of data for ENGR 198, across which there were 81 individual course sections. 
We pooled together data across individual sections in each semester. The total number of students for 
whom we have information varies with the number of semesters of data and sections of each course, 
ranging from 754 students in CHEM 202 to 9,679 students in CHEM 101. 

A descriptive examination of the grades in each course reveals that the proportion of students receiving 
different types of grades varies across courses. These range from a low of 20.0% of students receiving an 
A across all semesters of CHEM 101 to 54.6% of students in CHEM 202. For the most part, a sizable 
proportion of students receive a C grade in each course, with the exception of CHEM 202 and ENGR 198. 
We noticed that the grade distribution within each course is relatively consistent across semesters, 
enabling us to generalize our results to future semesters. 

Table 1. Final Grade Distribution of Students in STEM Courses 


Course     Semesters  Course    A, B, & C  Students in  % A Final  % B Final  % C Final
                      Sections  Students   Analyses     Grade      Grade      Grade
CHEM 101   10         10        9,679      8,675        20.0       57.9       22.1
CHEM 201   5          5         1,088      1,087        34.4       43.2       22.4
CHEM 202   3          3         754        751          54.6       38.2       7.2
ENGR 198   12         81        5,209      5,197        45.9       47.6       6.5
ENGR 199   9          18        3,236      3,230        52.9       34.9       12.1
PHYS 101   4          16        4,823      4,810        29.0       37.6       33.4
PHYS 201   6          6         2,054      1,823        32.5       32.6       34.9

Final grades represent the percentage of students receiving any version of a given letter grade (e.g., A+, A, A- are "A" students) 
across all semesters within a given course. Students in analyses represent those with no missing data for demographic and 
academic background information (see Appendix Table C.1 for full grade distribution). 


We describe the demographic and academic background of students in the analyses by course in Table 
2. Females compose approximately half of the students enrolled in first- and second-year chemistry 
courses but only about one-quarter of the students enrolled in these engineering or physics courses. The 
majority of students across all courses are U.S. citizens. Of the students taking the math placement 
exam, nearly 75% of students in chemistry courses and 90% of students in engineering or physics courses 
placed in the highest math course. Across semesters, students' mean semester GPA less the specific 
STEM course ranged from 3.07 to 3.39, roughly in line with STEM Academy expectations for students. 


Table 2. Demographic and Academic Background of Students in STEM Courses 


                     Citizenship:  Citizenship:  % Placed in
                     % Perm-US     % Non-US      Calculus 1+    Mean Sem. GPA
Course     % Female  Resident      Resident      for 1st Term   (less course)
CHEM 101   44.8      3.1           3.4           72.1           3.11
CHEM 201   51.7      5.7           4.4           78.8           3.39
CHEM 202   53.3      4.9           2.0           76.0           3.27
ENGR 198   25.9      3.1           7.3           93.0           3.07
ENGR 199   26.7      3.2           7.6           89.0           3.11
PHYS 101   28.9      2.7           6.1           92.5           3.26
PHYS 201   20.2      3.2           6.5           92.7           3.28


We next describe how the pattern of accessing resources in each category varies by final student grade 
across all semesters in Table 3. Instructors of each course will provide a different number of resources in 
the LMS for student use across semesters. These numbers vary highly both within (between each 
resource category) and across courses. The most important information from these descriptive results is 
how the means of resource accesses within any one course or category differ by final course grade. 5 


5 We observe some large standard deviations of the number of resource accesses, indicative of students accessing the same 
resources multiple times in the LMS (as opposed to downloading once and saving). This factor, along with the changing number 
of resources available to students across courses/semesters, is why we use the percentile rank to measure student resource use. 

We see a general trend that the mean number of resource accesses across all courses in the categories 
of Course Information, Exam Preparation, and Assignment (with the exception of CHEM 201) resources 
is highest for A students with B students next and C students having the lowest means. In other words, 
there appears to be a positive association between course resource use and student final grade. There is 
one exception: for the Lecture resources, only four of the seven courses follow the same positive 
trend between resource use and student final grade as with the other resource types. 
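
The regression models rely on percentile ranks of resource use rather than the raw hit counts summarized in Table 3. As a minimal sketch of how such a rank could be computed within one course-semester, the function below uses a mid-rank convention for ties. The tie-handling rule and the function name are assumptions, as this part of the paper does not specify them.

```python
def percentile_ranks(hits):
    """Mid-rank percentile (0-100) of each student's hit count among classmates."""
    n = len(hits)
    ranks = []
    for h in hits:
        below = sum(1 for x in hits if x < h)   # classmates with fewer hits
        ties = sum(1 for x in hits if x == h)   # includes the student
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks
```

Because only the ordering of hit counts matters, a student with an extreme number of accesses receives the same rank as one who is merely the top user, which is the protection against outliers that motivates the measure.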


Table 3. Mean Course Resource Hits by Final Grade 



           Course Information Resources     Exam Preparation Resources
Course     A Grades  B Grades  C Grades     A Grades  B Grades  C Grades
CHEM 101   1.6       1.5       1.4          27.5      24.9      20.2
           [1.9]     [1.7]     [1.6]        [11.8]    [11.2]    [10.6]
CHEM 201   13.1      12.0      10.6         19.9      13.4      10.1
           [9.8]     [9.5]     [9.1]        [22.5]    [16.8]    [13.5]
CHEM 202   14.8      13.4      12.3         NA        NA        NA
           [9.9]     [11.0]    [13.5]
ENGR 198   12.6      10.9      9.6          7.9       6.2       4.6
           [13.7]    [11.9]    [11.3]       [8.9]     [7.7]     [6.6]
ENGR 199   13.5      12.8      11.3         22.3      21.3      18.0
           [23.8]    [22.4]    [16.1]       [14.6]    [14.6]    [14.4]
PHYS 101   78.2      75.6      77.1         50.9      50.5      45.1
           [84.2]    [71.0]    [70.0]       [37.4]    [33.7]    [33.2]
PHYS 201   20.3      17.5      17.1         30.1      25.5      21.5
           [23.4]    [18.1]    [18.1]       [30.3]    [29.1]    [24.6]

           Assignment-Related Resources     Lecture-Related Resources
Course     A Grades  B Grades  C Grades     A Grades  B Grades  C Grades
CHEM 101   3.9       3.4       3.1          9.4       7.8       6.9
           [6.3]     [5.8]     [5.1]        [6.3]     [5.9]     [5.5]
CHEM 201   55.9      56.6      57.6         16.4      13.0      9.9
           [58.6]    [48.7]    [46.3]       [18.4]    [14.9]    [11.3]
CHEM 202   53.6      52.5      44.1         27.6      32.4      41.6
           [34.4]    [29.5]    [17.1]       [29.7]    [29.4]    [30.0]
ENGR 198   58.0      47.2      40.5         41.4      34.6      26.6
           [48.9]    [39.0]    [33.6]       [37.4]    [33.9]    [26.4]
ENGR 199   125.0     109.4     106.4        76.0      76.3      79.6
           [69.9]    [75.0]    [72.3]       [46.4]    [47.0]    [50.8]
PHYS 101   15.1      12.2      9.5          220.4     175.9     141.2
           [27.6]    [23.1]    [19.7]       [488.1]   [337.1]   [292.0]
PHYS 201   NA        NA        NA           31.3      24.3      27.5
                                            [53.6]    [45.1]    [49.8]

Standard deviations in brackets. We denote where there are no resources in a category within a given course as "NA." 


The current version of Student Explorer already uses percentile ranks for LMS course website page views because academic 
advisors are easily able to make sense of which students are accessing the course site frequently versus those who are not. 

4.2 Logistic and Multinomial Logistic Regression Results 

In Table 4, we display the estimates of the associations between course resource use by category and 
final grades for our logistic (A/B vs. C) and multinomial logistic (A vs. B and B vs. C) regression models. 


Table 4. Logistic and Multinomial Logistic Regression Results 



           Course Information Resources       Exam Preparation Resources
Course     A/B vs. C  A vs. B    B vs. C      A/B vs. C  A vs. B    B vs. C
CHEM 101   0.996***   0.999      0.998        1.014***   1.005***   1.017***
CHEM 201   1.017***   1.006*     1.009**      1.004      1.011***   1.003
CHEM 202   1.003      1.006      1.007        NA         NA         NA
ENGR 198   1.000      0.999      1.001        1.007***   1.007***   1.008**
ENGR 199   0.997      0.996*     0.999        1.006***   1.003~     1.011***
PHYS 101   0.994***   0.998      0.994***     1.005***   0.998      1.003*
PHYS 201   0.996~     0.998      0.996        1.012***   1.003      1.011**

           Assignment-Related Resources       Lecture-Related Resources
Course     A/B vs. C  A vs. B    B vs. C      A/B vs. C  A vs. B    B vs. C
CHEM 101   0.998      0.999      0.993***     1.000      1.004***   0.998
CHEM 201   0.982*     0.984*     0.987        0.999      0.999      1.005
CHEM 202   1.023***   1.003      1.011        0.978***   0.991**    0.983**
ENGR 198   1.001      1.002      0.998        1.001      1.002      1.004
ENGR 199   0.998      1.004**    0.995*       0.992***   0.992***   0.994*
PHYS 101   1.007***   1.010***   1.005**      0.995***   0.995***   0.999
PHYS 201   NA         NA         NA           0.985***   1.000      0.986***

~p<0.100, *p<0.050, **p<0.010, ***p<0.001. Results for the logistic regression interpreted as the odds of receiving either an A or 
B final grade versus a C for each one-percentile point increase in rank relative to peers within each resource category in a given 
course. Results for the multinomial logistic regression interpreted in a similar manner, separately comparing the odds of 
receiving an A versus B or a B versus C final grade. In both models, we control for each student's sex, citizenship, first-term math 
placement, semester GPA, and a set of semester fixed effects. We denote where there are no resources in a category in a course 
as "NA." We estimated separate models for each course. 


There is a positive and statistically significant (p<0.05) association between increased use of exam 
preparation resources and final student grades in five of the seven courses. This means that as a student 
uses the exam preparation resources in the LMS to a greater degree compared to their peers in the 
course, we predict that the student is more likely to receive an A or B grade as opposed to a C. The 
associations are greater for distinguishing between B and C students as opposed to A and B students, 
though the differences between A and B students are either positive or null. CHEM 202 does not have 
any exam preparation resources and in CHEM 201, the association is positive but not statistically 
significant. 

For the lecture-related resources, there is a negative and statistically significant (p < 0.05) association 
between resource use and final grades in four of the seven courses. The association is null in the 
remaining three courses. In other words, for students who use lecture-related resources to a greater 
degree, we predict that they are more likely to earn a C final grade as opposed to an A or B. For the 
course information and assignment-related resources, we find more of a mixed pattern regarding the 
association between the greater use of resources and grades. The use of assignment-related resources 
distinguishes between A or B students and C students in CHEM 202, where there are no exam 
preparation resources. 

The student demographic and academic background covariates serve as controls in our model to obtain 
less-biased estimates of the associations between course resources and final grades. Therefore, we do 
not focus on the estimates of these covariates and instead provide the full model estimates in Appendix 
Tables C.2 and C.3. 6 More importantly, the magnitude and statistical significance of associations 
between exam preparation resources and final grades did not substantially change between models 
where we excluded and included the covariates (results available from authors upon request). 

At first glance, these estimates appear small. For example, the estimate of exam preparation resources 
on grades in CHEM 101 suggests that for each one-percentile increase (relative to peers in the course) in 
the use of exam preparation resources, a student is 1.014 times as likely to receive an A or B instead of a 
C. Though the point estimate is small, it is important to note that this only reflects a one-percentile point 
increase in the use of exam preparation resources relative to peers. Substantial changes in a student's 
use of resources relative to their peers (e.g., improving by 10 percentile points) would result in the 
odds-ratio estimate being raised to the power of the percentile change (e.g., 1.014^10, or 1.149 times as 
likely to receive an A or B vs. a C for a 10 percentile point increase in using exam preparation resources). 7 
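
The arithmetic behind this scaling can be written out explicitly. Assuming the usual logistic-regression interpretation, in which the per-point odds ratio is the exponentiated coefficient, a change of $k$ percentile points multiplies the odds by the per-point odds ratio raised to the power $k$ (the symbol $\mathrm{OR}$ is introduced here for illustration only):

```latex
\mathrm{OR}_{k} = \left(e^{\beta}\right)^{k} = \mathrm{OR}_{1}^{\,k},
\qquad
\mathrm{OR}_{10} = 1.014^{10} = e^{10 \ln 1.014} \approx e^{0.139} \approx 1.149 .
```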

5 DISCUSSION 

5.1 Summary and Implications 

In our first research question, we asked, "What is the association between students' use of four types of 
course LMS resources and the likelihood that a student receives a final course grade of A or B in a core 
STEM course versus a C?" The results in this study indicate that increased use of exam-preparation 
resources is positively associated with final course grades across semesters when accounting for student 
demographic and academic covariates and semester fixed effects. In contrast, increased use of lecture- 
related resources is negatively associated with final course grades. For course information and 
assignment-related resources, we found mixed (some positive, negative, and null) associations. 

In our second research question, we asked, "Are there similarities or differences in the associations 
between students' course resource use based on type and final course grades across multiple courses?" 
We found that the positive association between exam-preparation resources and final course grades 


6 On average, females are less likely to earn an A or B (particularly an A) rather than a C than males; citizenship is not associated 
with grades, with the exception of non-US residents scoring higher in physics courses; higher semester GPAs are positively 
associated with final grade; and students not taking Calculus I or higher as their first-term math course receive lower 
course grades. 

7 This is due to the exponential nature of reporting odds-ratios from the results of logistic regression. 


ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution - NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) 



(2016). Improving early warning systems with categorized course resource usage. Journal of Learning Analytics, 3(3), 263-290. 
http://dx.doi.org/10.18608/jla.2016.33.13 

held true in five of the six courses that used exam preparation materials. We observed the negative 
association between lecture-related resources and final grades in four of the seven courses (null in the 
other three). 

Expanding on Macfadyen & Dawson's (2010) early paper demonstrating that types of LMS activity are 
related to student course performance, our results provide information about the value of specific types 
of content that instructors make available to their students. This is particularly useful for large 
introductory courses where instructors do not typically utilize the interactive LMS features identified by 
Macfadyen and Dawson. Specifically, our data show that increased use of resources designed to support 
exam preparation is an important factor in student success in these seven core STEM courses. The 
framework we used to categorize resources allows for instructor variability in the specific nature of the 
resources within categories, while still affording the opportunity to investigate multiple courses and 
verify the results across a set of courses. We can therefore hypothesize beyond our specific data set that 
the use of resources related to exam preparation would be important across the majority of the core 
first- and second-year STEM courses. This finding is likely to be unsurprising — but reassuring — to 
instructors who take the time to provide these types of resources on their class sites. 

While many studies of EWSs are focused on student retention (Jayaprakash et al., 2014), in this study we 
go beyond a binary pass/no-pass analysis (e.g., Fritz, 2016) to demonstrate important differences in LMS 
use between average and high achieving students. As most students in this university pass these core 
classes — indeed there were so few students with failing grades that we excluded them from our 
analysis — we were able to identify which kinds of LMS content were utilized more by the students with 
the highest grades. This is particularly important, as even high achieving students in these core courses 
are likely to experience a lower than expected grade (McKay, Miller, & Tritz, 2012) and subsequently use 
these grades in the courses to make decisions about whether to major in a particular field (Rask, 2010). 
Given that attrition in STEM majors is highest in the first or second year of university (Seymour & Hewitt, 
1997), it is important that we help students be successful at levels to which they aspire, as 
dissatisfaction with academic performance is a likely factor in the decision to leave a STEM major. 

For academic advisors, the visibility of resource use provided by the EWS allows them to identify how 
and when variability in "studenting skills" may need to be the focus of their intervention in order to 
promote successful student behaviours (Griffin, McGaw, & Care, 2012). We can better illustrate how 
these results can influence advisor conversations with students when looking at performance through 
Student Explorer by considering a scenario of a hypothetical student. Several weeks into the course, the 
advisor observes this student has an overall course grade on the margin between a B and C. This 
hypothetical student is also at the 25th percentile (bottom quartile) of resource use compared to their 
classmates. As an actionable intervention, the advisor could suggest using more of the course's LMS 
resources. In particular, they may focus on resource categories historically shown to be associated with 
the likelihood of receiving a higher final grade, such as exam preparation resources. 

We can use the results of our models to estimate how the student's likelihood of receiving a certain final 
grade would change if they increased their use of exam preparation resources. In Table 5 below, we 
present the change in likelihood of receiving a final grade if a hypothetical student were to change their 
use of exam resources from the 25th percentile (bottom quartile) to the 50th percentile (median). 


Table 5. Changing Exam Preparation Resource Use from 25th to 50th Percentile 

Course     A/B vs. C  A vs. B  B vs. C
CHEM 101   1.413      1.125    1.529
CHEM 201   NS         1.309    NS
CHEM 202   NA         NA       NA
ENGR 198   1.205      1.181    1.213
ENGR 199   1.159      NS       1.322
PHYS 101   1.127      NS       1.086
PHYS 201   1.357      NS       1.320

Results interpreted as the odds of receiving a higher final grade for a 
student that increases their exam preparation resource use from the 25th 
percentile to the 50th percentile (median) in the class within a given 
semester. NA (not available) and NS (not significant at 5% from the model 
results) also displayed. 


We predict that a student who moves from the 25th to the 50th percentile in their use of exam preparation 
resources will be anywhere from 1.13 to 1.41 times (depending on the course) as likely to receive an A or B 
rather than a C than if they did not change their resource use patterns. On the margin of receiving a B or 
a C, that is, the margin between a "desirable" and an "undesirable" outcome, the predicted increase in the 
likelihood of receiving a B ranges from 1.09 to 1.53 times. These are meaningful patterns and estimates, 
whereby a small yet targeted change in a student's behaviour could improve the likelihood of performing as 
desired in a course by 9 to 53%. As discussed above, achieving a B versus C grade (or even an A versus B 
grade) may determine a student's decision about whether to continue in STEM, particularly for women 
and underrepresented minority students who are most likely to experience stereotype threat in these 
courses (Nguyen & Ryan, 2008). 
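
The Table 5 values can be approximated from the per-point odds ratios printed in Table 4 by raising each to the 25th power. The sketch below does this for the A/B vs. C column. The function and variable names are hypothetical, and because Table 4 reports odds ratios rounded to three decimals, the results agree with Table 5 only to within a small margin, presumably because the published figures were computed from unrounded coefficients.

```python
# A/B vs. C odds ratios for exam preparation resources as printed in Table 4
# (rounded to three decimals; non-significant and NA entries omitted).
TABLE4_AB_VS_C = {
    "CHEM 101": 1.014,
    "ENGR 198": 1.007,
    "ENGR 199": 1.006,
    "PHYS 101": 1.005,
    "PHYS 201": 1.012,
}


def scaled_odds_ratio(per_point_or, points):
    """Odds ratio implied by a change of `points` percentile points."""
    return per_point_or ** points


def quartile_to_median(odds_ratios):
    """Scale each per-point odds ratio to a 25-point change (25th to 50th percentile)."""
    return {course: round(scaled_odds_ratio(o, 25), 3)
            for course, o in odds_ratios.items()}
```

Calling `quartile_to_median(TABLE4_AB_VS_C)` gives one scaled odds ratio per course that can be compared against the first column of Table 5.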

Another important finding is that these results held across semesters. Although the overall content 
structure within a given entry-level course remains largely the same across semesters, the instructors 
and resources available to students may change with each offering. We would expect some content and 
resource changes to continue over time. However, an increase in the use of student-centred pedagogies 
in introductory STEM courses that rely less on a few high-stakes exams as the main determinant for the 
course grade may necessitate a shift away from the importance of using exam-preparation resources 
(e.g., Towns & Grant, 1997). At first glance, changes across semesters to resources included in the LMS 
would seem problematic for incorporating resource use in an individualized, course-by-course manner 
into an EWS. However, our framework categorizes resources across courses, suggesting that system 
developers should incorporate simple mechanisms for instructors to assign a category label to each 
resource added to the course site. Making such labels visible to students could also help them to 
understand why and when to attend to various resources available on their course websites. 
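
One way such a labelling mechanism could feed an EWS is sketched below: instructors attach one of the four categories to each resource, and event-log accesses are then tallied per category. The class, function, and file names are hypothetical, and the event-log format is an assumption; this is not the Student Explorer implementation.

```python
from collections import Counter
from enum import Enum


class ResourceCategory(Enum):
    COURSE_INFORMATION = "course information"
    EXAM_PREPARATION = "exam preparation"
    ASSIGNMENT = "assignment"
    LECTURE = "lecture"


def hits_by_category(events, labels):
    """Count one student's LMS accesses per instructor-assigned category.

    events: iterable of resource identifiers, one per access in the event log.
    labels: dict mapping resource identifier -> ResourceCategory.
    Unlabelled resources are skipped rather than guessed.
    """
    counts = Counter({cat: 0 for cat in ResourceCategory})
    for resource_id in events:
        category = labels.get(resource_id)
        if category is not None:
            counts[category] += 1
    return counts
```

Keeping the labels in a separate mapping means resources can change from semester to semester without changing how the per-category counts are computed.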

For the development of Student Explorer specifically, the categorization of resources allows developers 
to include data on specific resource use into the classification system across all STEM courses. Doing so 
would replace the count of general LMS site page views in the EWS algorithm. Perhaps most 
important for the application of these results to the STEM Academy, students on the margin of 
receiving a "B" versus a "C" in the course can be distinguished by their use of exam preparation course 
resources. If the Student Explorer algorithm incorporated information about resource use, advisors 
could give students more targeted feedback to improve their habits, and potentially in turn, improve 
their grades. While the student's final grade is not the sole indicator of learning and various types of 
resources may support other aspects of knowledge building, the focus on course grade here reflects the 
STEM Academy's goal for all of their students to achieve at least a 3.0 overall grade point average. 

5.2 Limitations 

To build on this work, we need to be mindful of other approaches for aggregating course resource usage 
data provided by LMS event logs. Using percentile ranks allows for peer-to-peer comparisons and 
protects against outliers, but is not a perfect metric. Some students may use the resources as effectively 
as their peers but do so by downloading and saving the resource to their hard drives, which would 
represent one access in the event log without showing how many times they used that resource after 
the download. We will need to consider alternative metrics, including the proportion of materials used 
within each resource category, as well as a weighting scheme to reduce the influence of the number of 
accesses. In addition, not all courses may use materials falling directly into our resource categories (e.g., 
studio-based design courses), so working with faculty in non-STEM and advanced courses across 
disciplines will reveal ways to make our categorization scheme flexible while still supporting the success 
of this approach across different types of courses. As core STEM courses move away from primarily 
lecture-based instruction with multiple-choice exam assessment (Deslauriers, Schelew, & Wieman, 
2011), use of interactive features in the LMS may play a more prominent role in predicting student 
success (Macfadyen & Dawson, 2010). 
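
One of the alternative metrics mentioned above, the proportion of materials used within a resource category, could be computed as sketched here. The function name and inputs are hypothetical; the metric counts each resource at most once, so repeated accesses and download-and-save behaviour are treated alike.

```python
def proportion_used(accessed_ids, available_ids):
    """Share of a category's available resources that a student opened at least once."""
    available = set(available_ids)
    if not available:
        return None  # category has no resources in this course (an "NA" cell)
    return len(set(accessed_ids) & available) / len(available)
```

A student who opens two of four exam preparation files scores 0.5 whether each file was accessed once or many times.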

Our next steps involve expanding this analysis to refine our understanding of the relationship between 
course resource use and a student's developing grade over the course of a semester. For example, our 
results also suggest a negative association between the use of lecture-related resources and achieving 
higher grades. Because many types of lecture-related materials were captured as a single category (see 
Appendix B), we are unable to account for exactly why we observed this association. Some 
lecture-related resources included video and audio recordings of the lectures as well as the lecture overhead 
slides. It may be that some students are solely utilizing these materials instead of coming to class or 
perhaps focusing on reviewing lecture materials instead of testing their own knowledge with the exam 
preparation materials. Research on the impact of using lecture recordings in higher education is mixed, 
although there is some evidence that class attendance is an important mitigating factor (O'Callaghan, 
Neumann, Jones, & Creed, 2015). Unfortunately, course attendance is not captured by any of our 
campus systems, so we cannot add this variable into our analysis at this time. 

In our future work we will also need to examine how the association between resource use and grades 
changes during the semester (i.e., before and after exams), in order to better focus advisors' 
recommendations to students. For example, we may find that students who use exam preparation 
resources after failing a test may show improved performance on the next test. We also plan to examine 
whether or not including course resource use into Student Explorer along with additional demographic 
or background data allows us to better classify at-risk students. Linking students' online activity to other 
student data sources held by the university (e.g., admissions, registration, and financial aid data) 
presents new opportunities for predicting and intervening in student outcomes. 

Finally, we note that aspects of our EWS and performance measures are specific to the context in which 
we conducted this study: a large, public, residential research university in the United States. Accordingly, 
future EWS research will need to continue to accommodate the specific context in which the courses are 
offered, considering both cultural and institutional factors related to practices in higher education. 

6 CONCLUSION 

In this study we have taken a specific student behaviour — use of different types of resources available 
in course LMSs — and generalized it across a series of courses representing typical introductory STEM 
courses offered at a large, residential research university in the United States. The results demonstrate 
that such resources often have explanatory power in predicting student success, particularly when the 
goal is to achieve at least a B grade in each course. Whether EWSs or other systems provide this 
information to advisors, instructors, or students, the consistency of associations we observe here is 
powerful for designing specific recommendations for helping students to understand how to succeed in 
these courses. 

Instructors have long sought information about student use of the course resources that the instructor 
has curated and/or produced, particularly when instructors consider these resources to be valuable and 
take additional effort to create them. This study demonstrates that use of such resources is associated 
with student success in introductory STEM courses. We did not design our study to prescribe which specific 
kinds of resources instructors should or should not include on their course sites, but rather to reveal how 
leveraging a resource's metadata can become a powerful component in describing and understanding 
students' learning patterns and behaviours. Further research is needed to identify additional actionable 
data sources for building accurate and scalable Early Warning Systems and to better understand how 
such data can be best presented so that users of the data displays — advisors, instructors, and students 
— can take those appropriate actions. The question for designers and researchers of LA-based 
interventions is whether the data structures already available in logs generated by LMSs and other 
forms of technology-enabled learning allow for the investigation and definition of these associations so 
that targeted interventions can be implemented without imposing a unique structure for every course 
(or groups of courses) within a university. 

Appendix A. Description of Core STEM Courses Included in Analyses 


Course 

Description 

CHEM 101 

General Chemistry Lab 1. A "course designed around student interdependence and inter-group 
collaboration" where "students perform chemistry experiments in a group learning 
environment." The primary objectives of the course include encouraging scientific and critical 
thinking through teamwork, experiencing how experimental results demonstrate various 
chemical principles, and engaging students in the process of using the scientific method and 
reasoning. These objectives suggest that success in the course will be central to success in future 
STEM courses. 

CHEM 201 

Structure and Reactivity 2. Continuation of the introduction to organic chemistry course. The 
course must be taken concurrently with CHEM 202. Students get further practice in applying the 
major concepts of chemistry to predicting the physical and chemical properties of organic 
compounds, including macromolecules, both synthetic and biological. Course exams will test 
students' ability to project and apply the broad concepts to new and unfamiliar situations. 

CHEM 202 
(Lab) 

Synthesis and Characterization of Organic Compounds. Students participate in planning exactly 
what they are going to do in the laboratory by being given general goals and directions that have 
to be adapted to fit the specific project they will be working on. They use microscale equipment, 
which requires them to develop manual dexterity and care in working in the laboratory. They also 
evaluate the results of their experiments by checking for identity and purity using various 
chromatographic and spectroscopic methods. Must be taken concurrently with CHEM 201. 

ENGR 198 
(Project) 

Introduction to Engineering. Focused team projects dealing with technical, economic, safety, 
environmental, and social aspects of a real-world engineering problem. Written, oral, and visual 
communication required within the engineering profession; reporting on the team engineering 
projects. The role of the engineer in society; engineering ethics. Organization and skills for 
effective teams. 

ENGR 199 

Introduction to Computers and Programming. This course introduces first-year students to the 
concept of an algorithm: a well-defined set of instructions that achieve a particular goal. 
Constructing an algorithm for a given purpose is a fundamental form of engineering design task, 
and developing computer programs is part of almost every modern engineering project. Students 
learn how to conceptualize algorithms for solving engineering problems and express them in the 
programming languages MATLAB and C++. 

PHYS 101 

General Physics 1. This course offers an introduction to classical mechanics, the physics of motion. 
Topics include: vectors, linear motion, projectiles, relative velocity and acceleration, circular 
motion, Newton's laws, particle dynamics, work and energy, linear momentum, torque, angular 
momentum, gravitation, planetary motion, fluid statics and dynamics, simple harmonic motion, 
waves and sound. Should be taken concurrently with the corresponding lab course. 

PHYS 201 

General Physics 2. This course covers topics in electricity and magnetism: charge, Coulomb's law, 
electric fields, Gauss' law, electric potential, capacitors and dielectrics, current and resistance, 
EMF and circuits, magnetic fields, Biot-Savart law, Ampère's law, Faraday's Law of Induction, and 
simple AC circuits. Should be taken concurrently with the corresponding lab course. 
Appendix B. Classification of Course Resources 

We categorized course resources by labelling each resource manually, guided by a probabilistic model 
that suggested the most likely resource category based on the file name. We initially built this model 
by categorizing resources from a previous study on CHEM 101 (Waddington & Nam, 2014). We first split 
the file names from each course's site on spaces, punctuation, and any other non-alphanumeric 
characters to build a course resources corpus. We used the frequencies of corpus elements to train the 
probability model to infer the resource category of a given file name. Then, we applied this model to 
another course's resource list and manually corrected any errors. We then reused the corrected results 
as a training set to update the model and applied the model to predict the resource category of another 
course's unlabelled resource list. We repeated this iterative process for each course, which reduced the 
time needed to manually label the large number of course resources, which differ across courses and 
semesters. In Table B.1, we display examples of the types of resources within each resource category. 


Table B.1. Categories of LMS Course Resources 

| Category | Examples of Resources |
|---|---|
| Course Information | Schedules, Course Website, Announcements, Syllabus, Instructor Information, Course Grades |
| Lecture Materials | Lecture Notes, Discussion Tools, General Resources, Online Learning Resources, Lecture Audio Recordings, Cross Discipline Learning Objects |
| Assignments | Experiments, Pre-labs, Team Assignments, Team Report Forms, Discussion |
| Exam Preparation | Sample Exams, Exam Review |
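
The labelling procedure described in this appendix can be sketched in code. The following is a minimal illustration only. It assumes the probabilistic model is a multinomial naive Bayes classifier over file name tokens with add-one smoothing, which is one common way to turn token frequencies into category suggestions and is not necessarily the model used in this study. The class name, function names, and example file names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(filename):
    """Split a file name on any run of non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", filename.lower()) if t]


class FilenameCategoryModel:
    """Token-frequency model with add-one smoothing (an assumed choice)."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()               # category -> labelled files
        self.vocabulary = set()

    def update(self, labelled_files):
        """Add manually verified (filename, category) pairs to the counts."""
        for filename, category in labelled_files:
            tokens = tokenize(filename)
            self.token_counts[category].update(tokens)
            self.doc_counts[category] += 1
            self.vocabulary.update(tokens)

    def suggest(self, filename):
        """Return the most likely category for a file name."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.token_counts[category]
            n_tokens = sum(counts.values())
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for token in tokenize(filename):
                score += math.log((counts[token] + 1) / (n_tokens + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical file names, invented for illustration only.
model = FilenameCategoryModel()
model.update([
    ("Exam1_practice.pdf", "Exam Preparation"),
    ("sample-exam-2.pdf", "Exam Preparation"),
    ("Lecture_03_notes.pdf", "Lecture Materials"),
    ("lecture04-slides.ppt", "Lecture Materials"),
    ("syllabus_fall.pdf", "Course Information"),
])
suggestion = model.suggest("Lecture 05 notes.pdf")

# A human reviews the suggestion; the verified label is fed back in
# before the model is applied to the next course's resource list.
model.update([("Lecture 05 notes.pdf", "Lecture Materials")])
```

Calling `update` with the corrected labels after each course mirrors the iterative step described above, in which each newly corrected resource list becomes training data for the next course.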
Appendix C. Additional Tables of Results 

Table C.1. Full Final Grade Distribution of Students in STEM Courses 

(Final grade columns: percentage across all semesters) 

| Course | Total Students | Excluded Students | A Final Grade (%) | B Final Grade (%) | C Final Grade (%) | D Final Grade (%) | F Final Grade (%) | W/I Final Grade (%) |
|---|---|---|---|---|---|---|---|---|
| CHEM 101 | 9,958 | 279 | 19.8 | 56.1 | 21.3 | 0.5 | 0.0 | 1.6 |
| CHEM 201 | 1,241 | 153 | 30.1 | 38.0 | 19.6 | 4.2 | 0.0 | 4.2 |
| CHEM 202 | 767 | 13 | 53.7 | 37.5 | 7.0 | 0.1 | 0.0 | 1.3 |
| ENGR 198 | 5,271 | 62 | 44.8 | 46.9 | 6.5 | 0.6 | 0.0 | 0.9 |
| ENGR 199 | 3,470 | 234 | 49.5 | 32.6 | 11.4 | 2.0 | 0.0 | 2.6 |
| PHYS 101 | 5,481 | 658 | 25.5 | 33.1 | 29.3 | 4.2 | 0.0 | 1.7 |
| PHYS 201 | 2,249 | 195 | 29.4 | 30.2 | 31.7 | 3.6 | 0.0 | 3.0 |

Total students represent the number of students enrolled in the course across semesters as per registrar records. Excluded 
students represent those students with grade E or Pass/Fail cases. Final grades represent the percentage of students receiving 
any version of a given letter grade (e.g., A+, A, A- are "A" students). 


Table C.2. Full Results from Logistic Regression Models 

| Variables | CHEM 101 | CHEM 201 | CHEM 202 | ENGR 198 | ENGR 199 | PHYS 101 | PHYS 201 |
|---|---|---|---|---|---|---|---|
| Course Info Res. | 0.996*** | 1.017*** | 1.003 | 1.000 | 0.997 | 0.994*** | 0.996~ |
| Exam Prep Res. | 1.014*** | 1.004 | NA | 1.007*** | 1.006** | 1.005*** | 1.012*** |
| Assignment Res. | 0.998 | 0.982* | 1.023** | 1.001 | 0.998 | 1.007*** | NA |
| Lecture Res. | 1.000 | 0.999 | 0.978*** | 1.001 | 0.992*** | 0.995*** | 0.985*** |
| Female | 0.873** | 0.677* | 1.207 | 1.314* | 0.821~ | 0.560* | 0.712* |
| Perm. Resident | 1.216 | 1.041 | 1.166 | 0.786 | 1.182 | 1.040 | 0.893 |
| Non-Resident | 1.021 | 0.861 | 1.762 | 0.517*** | 0.895 | 3.084*** | 3.542*** |
| Semester GPA | 10.311*** | 6.773*** | 8.235*** | 4.644*** | 6.269*** | 10.460*** | 16.373*** |
| No Calc 1+ Place. | 0.673*** | 0.570** | 0.593~ | 0.464*** | 0.572*** | 0.509*** | 0.425*** |
| No Place. Test | 0.453*** | 0.387** | 0.388* | 0.357** | 0.078*** | 0.239*** | NA |

~p<0.100, *p<0.050, **p<0.010, ***p<0.001. Results for the logistic regression are interpreted as the odds of receiving a final 
grade of A versus B versus C for each variable. We denote where there are no resources in a category in a course as "NA." We 
estimated separate models for each course. 
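
Because the resource estimates in Table C.2 are odds ratios close to 1, their practical size is easier to see once they are compounded over several units of the predictor. The sketch below shows that arithmetic under standard logistic regression assumptions: the coefficient is the natural log of the odds ratio, and a change of k units multiplies the odds by the odds ratio raised to the power k. The function names, the 10-unit change, and the 0.50 baseline probability are hypothetical illustration values, and the sketch makes no assumption about what one unit of resource use represents in the models.

```python
import math


def coefficient(odds_ratio):
    """Logistic regression coefficient (log-odds) behind a reported odds ratio."""
    return math.log(odds_ratio)


def compounded_odds_ratio(odds_ratio, units):
    """Odds ratio implied by a change of `units` in the predictor."""
    return math.exp(coefficient(odds_ratio) * units)


def shifted_probability(baseline_probability, odds_ratio, units=1):
    """Probability after scaling the baseline odds by the compounded odds ratio."""
    odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds * compounded_odds_ratio(odds_ratio, units)
    return new_odds / (1 + new_odds)


# 1.014 is the CHEM 101 exam preparation estimate from Table C.2.
# The 10-unit change and the 0.50 baseline are hypothetical illustration values.
or_10_units = compounded_odds_ratio(1.014, 10)
p_after = shifted_probability(0.50, 1.014, units=10)
```

The same functions apply to estimates below 1, such as the lecture resource odds ratios, where compounding moves the odds downward instead.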
Table C.3. Full Results from Multinomial Logistic Regression Models 

A vs. B Final Grades 

| Variables | CHEM 101 | CHEM 201 | CHEM 202 | ENGR 198 | ENGR 199 | PHYS 101 | PHYS 201 |
|---|---|---|---|---|---|---|---|
| Course Info Res. | 0.998 | 1.006* | 1.006 | 0.999 | 0.996* | 0.998 | 0.999 |
| Exam Prep Res. | 1.005*** | 1.011*** | NA | 1.007*** | 1.003~ | 0.998 | 1.002 |
| Assignment Res. | 0.999 | 0.984* | 1.003 | 1.002 | 1.004** | 1.010*** | NA |
| Lecture Res. | 1.004*** | 0.999 | 0.991** | 1.002 | 0.992*** | 0.995*** | 0.999 |
| Female | 1.178** | 0.613** | 0.928 | 1.667*** | 0.683*** | 0.623*** | 0.539*** |
| Perm. Resident | 1.282 | 0.578 | 1.084 | 0.950 | 0.715 | 1.013 | 1.561 |
| Non-Resident | 1.743*** | 1.229 | 1.021 | 0.810~ | 1.535* | 3.788*** | 3.778*** |
| Semester GPA | 21.667*** | 14.400*** | 8.903*** | 4.452*** | 7.239*** | 14.294*** | 15.493*** |
| No Calc 1+ Place. | 0.710*** | 0.616* | 0.605* | 0.573*** | 0.593*** | 0.599* | 1.109 |
| No Place. Test | 0.515*** | 0.243* | 0.386* | 0.247* | 0.204* | 0.228 | NA |

B vs. C Final Grades 

| Variables | CHEM 101 | CHEM 201 | CHEM 202 | ENGR 198 | ENGR 199 | PHYS 101 | PHYS 201 |
|---|---|---|---|---|---|---|---|
| Course Info Res. | 0.998 | 1.009** | 1.007 | 1.001 | 0.999 | 0.994*** | 0.997 |
| Exam Prep Res. | 1.017*** | 1.003 | NA | 1.008** | 1.011*** | 1.003* | 1.010** |
| Assignment Res. | 0.994*** | 0.987 | 1.011 | 0.998 | 0.995* | 1.005** | NA |
| Lecture Res. | 0.998 | 1.005 | 0.983*** | 1.004 | 0.994* | 0.999 | 0.988*** |
| Female | 0.841** | 0.836 | 1.129 | 1.201 | 0.971 | 0.584*** | 0.793 |
| Perm. Resident | 1.221 | 1.103 | 0.682 | 0.753 | 1.314 | 1.046 | 0.560~ |
| Non-Resident | 1.161 | 0.804 | NA | 0.488*** | 0.663 | 2.014*** | 2.247** |
| Semester GPA | 6.120*** | 4.728*** | 3.942*** | 3.184*** | 2.793*** | 5.185*** | 5.547*** |
| No Calc 1+ Place. | 0.719*** | 0.521*** | 0.747 | 0.568*** | 0.682* | 0.521*** | 0.441*** |
| No Place. Test | 0.534*** | 0.367*** | 0.760 | 0.330*** | 0.167*** | 0.498 | NA |

~p<0.100, *p<0.050, **p<0.010, ***p<0.001. Results for the multinomial logistic regression are interpreted as the odds of 
receiving a final grade of A versus B separately from receiving a final grade of B versus C for each variable. We denote where 
there are no resources in a category in a course as "NA." We estimated separate models for each course. 
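
The two panels of Table C.3 describe two separate comparisons (A versus B, and B versus C). As a rough illustration of how two such odds jointly determine the probabilities of all three grades, the sketch below normalizes the odds into probabilities and then rescales them by a pair of reported odds ratios. It assumes the odds are read as P(A)/P(B) and P(B)/P(C) and that each odds ratio applies to a one-unit change in the predictor. The function name and the baseline odds of 0.8 and 2.0 are hypothetical and are not taken from the study.

```python
def grade_probabilities(odds_a_vs_b, odds_b_vs_c):
    """Turn the two odds P(A)/P(B) and P(B)/P(C) into P(A), P(B), P(C)."""
    # Express every probability relative to P(C), then normalize.
    weight_c = 1.0
    weight_b = odds_b_vs_c * weight_c
    weight_a = odds_a_vs_b * weight_b
    total = weight_a + weight_b + weight_c
    return {"A": weight_a / total, "B": weight_b / total, "C": weight_c / total}


# Hypothetical baseline odds for one student; not taken from the study.
baseline = grade_probabilities(odds_a_vs_b=0.8, odds_b_vs_c=2.0)

# Scale each baseline odds by the CHEM 101 exam preparation odds ratios
# reported in Table C.3 (1.005 for A vs. B, 1.017 for B vs. C).
shifted = grade_probabilities(0.8 * 1.005, 2.0 * 1.017)
```

Under these assumptions, both odds ratios above 1 shift probability away from a C and toward the higher grades, although the size of the shift depends on the baseline odds chosen.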
Table C.4. Full Results from Multinomial Logistic Regression Models (D & F Students Included) 

A vs. B Final Grades 

| Variables | CHEM 101 | CHEM 201 | CHEM 202 | ENGR 198 | ENGR 199 | PHYS 101 | PHYS 201 |
|---|---|---|---|---|---|---|---|
| Course Info Res. | 0.998 | 1.006* | 1.006 | 0.999 | 0.996* | 0.998 | 0.998 |
| Exam Prep Res. | 1.005*** | 1.011*** | NA | 1.007*** | 1.003~ | 0.998 | 1.003 |
| Assignment Res. | 0.999 | 0.984* | 1.003 | 1.002 | 1.005** | 1.010*** | NA |
| Lecture Res. | 1.004*** | 0.999 | 0.991** | 1.002 | 0.992*** | 0.995*** | 1.000 |
| Female | 1.226** | 0.612** | 0.928 | 1.667*** | 0.684*** | 0.617*** | 0.543*** |
| Perm. Resident | 1.155 | 0.583 | 1.083 | 0.933 | 0.737 | 1.002 | 1.724 |
| Non-Resident | 1.651** | 1.203 | 1.021 | 0.813~ | 1.534* | 3.799*** | 3.848*** |
| Semester GPA | 23.379*** | 14.422*** | 8.903*** | 4.452*** | 7.138*** | 13.949*** | 14.919*** |
| No Calc 1+ Place. | 0.714*** | 0.623* | 0.605* | 0.571*** | 0.592*** | 0.607* | 1.015 |
| No Place. Test | 0.490*** | 0.244* | 0.386* | 0.248* | 0.206* | 0.232 | NA |

B vs. C, D, or F Final Grades 

| Variables | CHEM 101 | CHEM 201 | CHEM 202 | ENGR 198 | ENGR 199 | PHYS 101 | PHYS 201 |
|---|---|---|---|---|---|---|---|
| Course Info Res. | 0.998~ | 1.010** | 1.007 | 1.001 | 1.000 | 0.994*** | 0.996 |
| Exam Prep Res. | 1.017*** | 1.002 | NA | 1.008** | 1.013*** | 1.004* | 1.012*** |
| Assignment Res. | 0.993*** | 0.989 | 1.011 | 0.999 | 0.995* | 1.005** | NA |
| Lecture Res. | 0.998 | 1.004 | 0.983*** | 1.004 | 0.995* | 0.998 | 0.985*** |
| Female | 0.855* | 0.805 | 1.129 | 1.255 | 0.987 | 0.569*** | 0.754 |
| Perm. Resident | 1.124 | 1.121 | 0.682 | 0.675 | 1.199 | 1.123 | 0.559~ |
| Non-Resident | 1.130 | 0.750 | NA | 0.447*** | 0.701 | 1.791** | 2.631** |
| Semester GPA | 6.382*** | 5.014*** | 3.944*** | 3.307*** | 3.045*** | 6.046*** | 6.042*** |
| No Calc 1+ Place. | 0.683*** | 0.511*** | 0.747 | 0.596*** | 0.714* | 0.512*** | 0.425*** |
| No Place. Test | 0.516*** | 0.384** | 0.759 | 0.339** | 0.165*** | 0.487 | NA |

~p<0.100, *p<0.050, **p<0.010, ***p<0.001. Results for the multinomial logistic regression are interpreted as the odds of 
receiving a final grade of A versus B separately from receiving a final grade of B versus C, D, or F for each variable. We denote 
where there are no resources in a category in a course as "NA." We estimated separate models for each course. 